Incorporating Linguistic Knowledge in Statistical Machine Translation: Translating Prepositions
نویسندگان
چکیده
Prepositions are hard to translate, because their meaning is often vague, and the choice of the correct preposition is often arbitrary. At the same time, making the correct choice is often critical to the coherence of the output text. In the context of statistical machine translation, this difficulty is enhanced due to the possible long distance between the preposition and the head it modifies, as opposed to the local nature of standard language models. In this work we use monolingual language resources to determine the set of prepositions that are most likely to occur with each verb. We use this information in a transfer-based Arabic-to-Hebrew statistical machine translation system. We show that incorporating linguistic knowledge on the distribution of prepositions significantly improves the translation quality.
منابع مشابه
Improving Translation to Morphologically Rich Languages (Améliorer la traduction des langages morphologiquement riches) [in French]
Améliorer la traduction des langages morphologiquement riches While statistical techniques for machine translation have made significant progress in the last 20 years, results for translating to morphologically rich languages are still mixed versus previous generation rule-based systems. Current research in statistical techniques for translating to morphologically rich languages varies greatly ...
متن کاملA System for Translating Locative Prepositions from English into French
Machine translation of locative prepositions is not straightforward, even between closely related languages. This paper discusses a system of translation of locative prepositions between English and French. The system is based on the premises that English and French do not always conceptualize objects in the same way, and that this accounts for the major differences in the ways that locative pr...
متن کاملUse of Rich Linguistic Information to Translate Prepositions and Grammatical Cases to Basque
This paper presents three successful techniques to translate prepositions heading verbal complements by means of rich linguistic information, in the context of a rule-based Machine Translation system for an agglutinative language with scarce resources. This information comes in the form of lexicalized syntactic dependency triples, verb subcategorization and manually coded selection rules based ...
متن کاملLinguistically-Enriched Models for Bulgarian-to-English Machine Translation
In this paper, we present our linguisticallyenriched Bulgarian-to-English statistical machine translation model, which takes a statistical machine translation (SMT) system as backbone various linguistic features as factors. The motivation is to take advantages of both the robustness of the SMT system and the rich linguistic knowledge from morphological analysis as well as the hand-crafted gramm...
متن کاملEnriching Morphologically Poor Languages for Statistical Machine Translation
We address the problem of translating from morphologically poor to morphologically rich languages by adding per-word linguistic information to the source language. We use the syntax of the source sentence to extract information for noun cases and verb persons and annotate the corresponding words accordingly. In experiments, we show improved performance for translating from English into Greek an...
متن کامل